Computing TCP’s Retransmission Timer


Abstract 


This document defines the standard algorithm that Transmission 
Control Protocol (TCP) senders are required to use to compute and 
manage their retransmission timer. It expands on the discussion in 
Section 4.2.3.1 of RFC 1122 and upgrades the requirement of 
supporting the algorithm from a SHOULD to a MUST. This document 
obsoletes RFC 2988. 


Status of This Memo


This is an Internet Standards Track document.


This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 5741.


Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc6298.


Copyright Notice


Copyright (c) 2011 IETF Trust and the persons identified as the 
document authors. All rights reserved. 


This document is subject to BCP 78 and the IETF Trust’s Legal 
Provisions Relating to IETF Documents 
(http://trustee.ietf.org/license-info) in effect on the date of 
publication of this document. Please review these documents 
carefully, as they describe your rights and restrictions with respect 
to this document. Code Components extracted from this document must 
include Simplified BSD License text as described in Section 4.e of 
the Trust Legal Provisions and are provided without warranty as 
described in the Simplified BSD License. 


1. Introduction 


The Transmission Control Protocol (TCP) [Pos81] uses a retransmission 
timer to ensure data delivery in the absence of any feedback from the 
remote data receiver. The duration of this timer is referred to as 
RTO (retransmission timeout). RFC 1122 [Bra89] specifies that the 
RTO should be calculated as outlined in [Jac88]. 


This document codifies the algorithm for setting the RTO. In 
addition, this document expands on the discussion in Section 4.2.3.1 
of RFC 1122 and upgrades the requirement of supporting the algorithm 
from a SHOULD to a MUST. RFC 5681 [APB09] outlines the algorithm TCP
uses to begin sending after the RTO expires and a retransmission is
sent. This document does not alter the behavior outlined in RFC 5681
[APB09].


In some situations, it may be beneficial for a TCP sender to be more 
conservative than the algorithms detailed in this document allow. 
However, a TCP MUST NOT be more aggressive than the following 
algorithms allow. This document obsoletes RFC 2988 [PA00]. 


The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [Bra97]. 


2. The Basic Algorithm


To compute the current RTO, a TCP sender maintains two state
variables, SRTT (smoothed round-trip time) and RTTVAR (round-trip
time variation). In addition, we assume a clock granularity of G
seconds.


The rules governing the computation of SRTT, RTTVAR, and RTO are as
follows:


(2.1) Until a round-trip time (RTT) measurement has been made for a 
segment sent between the sender and receiver, the sender SHOULD 
set RTO <- 1 second, though the "backing off" on repeated 
retransmission discussed in (5.5) still applies. 


Note that the previous version of this document used an initial 
RTO of 3 seconds [PA00]. A TCP implementation MAY still use 
this value (or any other value > 1 second). This change in the 
lower bound on the initial RTO is discussed in further detail 
in Appendix A. 


(2.2) When the first RTT measurement R is made, the host MUST set 


SRTT <- R 
RTTVAR <- R/2 
RTO <- SRTT + max (G, K*RTTVAR) 


where K = 4.


(2.3) When a subsequent RTT measurement R’ is made, a host MUST set


RTTVAR <- (1 - beta) * RTTVAR + beta * |SRTT - R’|
SRTT <- (1 - alpha) * SRTT + alpha * R’


The value of SRTT used in the update to RTTVAR is its value 
before updating SRTT itself using the second assignment. That 
is, updating RTTVAR and SRTT MUST be computed in the above 
order. 


The above SHOULD be computed using alpha=1/8 and beta=1/4 (as 
suggested in [JK88]). 


After the computation, a host MUST update


RTO <- SRTT + max (G, K*RTTVAR)


(2.4) Whenever RTO is computed, if it is less than 1 second, then the 
RTO SHOULD be rounded up to 1 second. 


Traditionally, TCP implementations use coarse grain clocks to 
measure the RTT and trigger the RTO, which imposes a large 
minimum value on the RTO. Research suggests that a large 
minimum RTO is needed to keep TCP conservative and avoid 
spurious retransmissions [AP99]. Therefore, this specification 
requires a large minimum RTO as a conservative approach, while 
at the same time acknowledging that at some future point,
research may show that a smaller minimum RTO is acceptable or 
superior. 


(2.5) A maximum value MAY be placed on RTO provided it is at least 60 
seconds. 
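Rules (2.1) through (2.5) can be collected into a small estimator. The
following is a minimal sketch in Python, not a reference
implementation: the names RtoEstimator and on_rtt_sample, the default
granularity of 1 msec, and the choice of exactly 60 seconds as the
maximum are assumptions made for illustration. Times are in seconds.

```python
from dataclasses import dataclass
from typing import Optional

K = 4            # multiplier on RTTVAR, rule (2.2)
ALPHA = 1 / 8    # gain for SRTT, rule (2.3)
BETA = 1 / 4     # gain for RTTVAR, rule (2.3)
MIN_RTO = 1.0    # seconds, rule (2.4)
MAX_RTO = 60.0   # seconds; assumed cap, rule (2.5) allows any cap >= 60


@dataclass
class RtoEstimator:
    """Hypothetical holder for the SRTT, RTTVAR, and RTO state variables."""
    g: float = 0.001                  # assumed clock granularity G
    srtt: Optional[float] = None      # unset until the first measurement
    rttvar: Optional[float] = None
    rto: float = 1.0                  # rule (2.1): initial RTO of 1 second

    def on_rtt_sample(self, r: float) -> float:
        """Fold one RTT measurement into the state and return the new RTO."""
        if self.srtt is None:
            # Rule (2.2): first measurement.
            self.srtt = r
            self.rttvar = r / 2
        else:
            # Rule (2.3): RTTVAR is updated first, using the old SRTT.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - r)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * r
        rto = self.srtt + max(self.g, K * self.rttvar)
        # Rules (2.4) and (2.5): round up to the minimum, cap at the maximum.
        self.rto = min(max(rto, MIN_RTO), MAX_RTO)
        return self.rto
```

With a first measurement of 2 seconds, this sketch yields SRTT = 2,
RTTVAR = 1, and RTO = 6 seconds; a second measurement of 2 seconds
lowers RTTVAR to 0.75 and RTO to 5 seconds.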


3. Taking RTT Samples 


TCP MUST use Karn’s algorithm [KP87] for taking RTT samples. That
is, RTT samples MUST NOT be made using segments that were 
retransmitted (and thus for which it is ambiguous whether the reply 
was for the first instance of the packet or a later instance). The 
only case when TCP can safely take RTT samples from retransmitted 
segments is when the TCP timestamp option [JBB92] is employed, since 
the timestamp option removes the ambiguity regarding which instance 
of the data segment triggered the acknowledgment. 


Traditionally, TCP implementations have taken one RTT measurement at
a time (typically, once per RTT). However, when using the timestamp 
option, each ACK can be used as an RTT sample. RFC 1323 [JBB92] 
suggests that TCP connections utilizing large congestion windows 
should take many RTT samples per window of data to avoid aliasing 
effects in the estimated RTT. A TCP implementation MUST take at
least one RTT measurement per RTT (unless that is not possible per 
Karn’s algorithm). 


For fairly modest congestion window sizes, research suggests that 
timing each segment does not lead to a better RTT estimator [AP99]. 
Additionally, when multiple samples are taken per RTT, the alpha and
beta defined in Section 2 may keep an inadequate RTT history. A 
method for changing these constants is currently an open research 
question. 
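The sampling rule of this section can be sketched as a single
decision function. This is an illustration only: SentSegment and
rtt_sample are hypothetical names, and the timestamp option is
simplified to an echoed send time expressed in seconds on the
sender's own clock.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SentSegment:
    """Hypothetical bookkeeping record for one outstanding segment."""
    send_time: float             # time of the first transmission, seconds
    retransmitted: bool = False  # set once the segment is sent again


def rtt_sample(seg: SentSegment, ack_time: float,
               echoed_timestamp: Optional[float] = None) -> Optional[float]:
    """Return an RTT sample in seconds, or None if the ACK is ambiguous."""
    if echoed_timestamp is not None:
        # Timestamp option in use: the echoed value identifies which
        # transmission triggered the ACK, so the sample is unambiguous.
        return ack_time - echoed_timestamp
    if seg.retransmitted:
        # Karn's algorithm: never sample a retransmitted segment.
        return None
    return ack_time - seg.send_time
```

A sample returned by this function would then be fed to the
computation in Section 2; a result of None means the ACK is skipped
for RTT purposes.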


4. Clock Granularity


There is no requirement for the clock granularity G used for
computing RTT measurements and the different state variables.
However, if the K*RTTVAR term in the RTO calculation equals zero, the 
variance term MUST be rounded to G seconds (i.e., use the equation 
given in step 2.3). 


RTO <- SRTT + max (G, K*RTTVAR)


Experience has shown that finer clock granularities (<= 100 msec) 
perform somewhat better than coarser granularities.


Note that [Jac88] outlines several clever tricks that can be used to
obtain better precision from coarse granularity timers. These 
changes are widely implemented in current TCP implementations. 
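The effect of the max (G, K*RTTVAR) term can be seen with a short
worked example. The sketch below uses integer milliseconds to keep
the arithmetic exact; the function names are hypothetical, and the
assumption that a coarse clock rounds each measurement down to a
whole tick is made only for illustration. The result is the value
before the 1 second minimum of rule (2.4) is applied.

```python
K = 4  # multiplier on RTTVAR, as in Section 2


def quantize_ms(rtt_ms: int, g_ms: int) -> int:
    """Round a measured RTT down to a whole number of clock ticks."""
    return (rtt_ms // g_ms) * g_ms


def raw_rto_ms(srtt_ms: int, rttvar_ms: int, g_ms: int) -> int:
    """SRTT + max(G, K*RTTVAR), before the minimum and maximum are applied."""
    return srtt_ms + max(g_ms, K * rttvar_ms)
```

With G = 100 msec, a true RTT of 237 msec is recorded as 200 msec. If
every sample is recorded as 200 msec, RTTVAR decays toward zero, and
the variance term is then held at G, giving 200 + 100 = 300 msec
rather than 200 msec.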


5. Managing the RTO Timer 


An implementation MUST manage the retransmission timer(s) in such a 
way that a segment is never retransmitted too early, i.e., less than 
one RTO after the previous transmission of that segment. 


The following is the RECOMMENDED algorithm for managing the 
retransmission timer: 


(5.1) Every time a packet containing data is sent (including a 
retransmission), if the timer is not running, start it running 
so that it will expire after RTO seconds (for the current value 
of RTO). 


(5.2) When all outstanding data has been acknowledged, turn off the 
retransmission timer. 


(5.3) When an ACK is received that acknowledges new data, restart the 
retransmission timer so that it will expire after RTO seconds 
(for the current value of RTO). 


When the retransmission timer expires, do the following: 


(5.4) Retransmit the earliest segment that has not been acknowledged 
by the TCP receiver. 


(5.5) The host MUST set RTO <- RTO * 2 ("back off the timer"). The 
maximum value discussed in (2.5) above may be used to provide 
an upper bound to this doubling operation. 


(5.6) Start the retransmission timer, such that it expires after RTO 
seconds (for the value of RTO after the doubling operation 
outlined in 5.5). 


(5.7) If the timer expires awaiting the ACK of a SYN segment and the 
TCP implementation is using an RTO less than 3 seconds, the RTO 
MUST be re-initialized to 3 seconds when data transmission 
begins (i.e., after the three-way handshake completes). 


This represents a change from the previous version of this 
document [PA00] and is discussed in Appendix A.


Note that after retransmitting, once a new RTT measurement is
obtained (which can only happen when new data has been sent and
acknowledged), the computations outlined in Section 2 are performed,
including the computation of RTO, which may result in "collapsing"
RTO back down after it has been subject to exponential back off (rule
5.5).


Note that a TCP implementation MAY clear SRTT and RTTVAR after 
backing off the timer multiple times as it is likely that the current 
SRTT and RTTVAR are bogus in this situation. Once SRTT and RTTVAR 
are cleared, they should be initialized with the next RTT sample 
taken per (2.2) rather than using (2.3). 
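The recommended steps (5.1) through (5.7) can be sketched as a small
event-driven object. This is an illustration under stated
assumptions: the class and method names are hypothetical, the caller
supplies the current time in seconds, and rule (5.7) is read
conservatively as raising the RTO to 3 seconds only when it is below
that value.

```python
class RetransmitTimer:
    """Hypothetical sketch of rules (5.1)-(5.7); the caller supplies 'now'."""

    def __init__(self, rto=1.0, max_rto=60.0):
        self.rto = rto              # current RTO in seconds
        self.max_rto = max_rto      # assumed cap, see rule (2.5)
        self.deadline = None        # None means the timer is off
        self.syn_timed_out = False  # remembered for rule (5.7)

    def on_data_sent(self, now):
        # (5.1) Start the timer only if it is not already running.
        if self.deadline is None:
            self.deadline = now + self.rto

    def on_new_ack(self, now, all_acked):
        if all_acked:
            self.deadline = None             # (5.2) turn the timer off
        else:
            self.deadline = now + self.rto   # (5.3) restart the timer

    def on_expiry(self, now, retransmit, awaiting_syn_ack=False):
        retransmit()                                 # (5.4) earliest unacked
        self.rto = min(self.rto * 2, self.max_rto)   # (5.5) back off
        self.deadline = now + self.rto               # (5.6) restart
        if awaiting_syn_ack:
            self.syn_timed_out = True

    def on_handshake_complete(self):
        # (5.7) After a SYN timeout, do not begin data transmission
        # with an RTO below 3 seconds.
        if self.syn_timed_out and self.rto < 3.0:
            self.rto = 3.0
```

The retransmit argument is a callback that resends the earliest
unacknowledged segment; keeping it outside the class leaves the
sketch independent of any particular send path.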


6. Security Considerations 


This document requires a TCP to wait for a given interval before 
retransmitting an unacknowledged segment. An attacker could cause a 
TCP sender to compute a large value of RTO by adding delay to a timed 
packet’s latency, or that of its acknowledgment. However, the 
ability to add delay to a packet’s latency often coincides with the 
ability to cause the packet to be lost, so it is difficult to see 
what an attacker might gain from such an attack that could cause more 
damage than simply discarding some of the TCP connection’s packets. 


The Internet, to a considerable degree, relies on the correct 
implementation of the RTO algorithm (as well as those described in 
RFC 5681) in order to preserve network stability and avoid congestion 
collapse. An attacker could cause TCP endpoints to respond more 
aggressively in the face of congestion by forging acknowledgments for 
segments before the receiver has actually received the data, thus 
lowering RTO to an unsafe value. But to do so requires spoofing the 
acknowledgments correctly, which is difficult unless the attacker can 
monitor traffic along the path between the sender and the receiver. 
In addition, even if the attacker can cause the sender’s RTO to reach 
too small a value, it appears the attacker cannot leverage this into 
much of an attack (compared to the other damage they can do if they 
can spoof packets belonging to the connection), since the sending TCP 
will still back off its timer in the face of an incorrectly 
transmitted packet’s loss due to actual congestion. 


The security considerations in RFC 5681 [APB09] are also applicable 
to this document.


7. Changes from RFC 2988


This document reduces the initial RTO from the previous 3 seconds 
[PA00] to 1 second, unless the SYN or the ACK of the SYN is lost, in 
which case the default RTO is reverted to 3 seconds before data 
transmission begins. 
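The practical effect of this change on connection establishment can
be illustrated by listing when successive copies of an unanswered SYN
would be sent under rules (2.1) and (5.5). The function name and the
60 second cap below are assumptions made for this sketch.

```python
def syn_send_times(initial_rto, retransmissions, max_rto=60.0):
    """Times, in seconds after the first SYN, at which each copy is sent
    when no copy is answered."""
    times = [0.0]
    rto = initial_rto
    for _ in range(retransmissions):
        times.append(times[-1] + rto)   # timer expires one RTO later
        rto = min(rto * 2, max_rto)     # rule (5.5): back off the timer
    return times


new_schedule = syn_send_times(1.0, 3)   # [0.0, 1.0, 3.0, 7.0]
old_schedule = syn_send_times(3.0, 3)   # [0.0, 3.0, 9.0, 21.0]
```

A connection whose first SYN is lost therefore retries after 1 second
instead of 3 seconds, and after a second loss at 3 seconds instead of
9 seconds.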


Appendix A. Rationale for Lowering the Initial RTO


Choosing a reasonable initial RTO requires balancing two competing 
considerations: 


1. The initial RTO should be sufficiently large to cover most of the
   end-to-end paths to avoid spurious retransmissions and their
   associated negative performance impact.


2. The initial RTO should be small enough to ensure a timely recovery
   from packet loss occurring before an RTT sample is taken.


Traditionally, TCP has used 3 seconds as the initial RTO [Bra89] 
[PA00]. This document calls for lowering this value to 1 second 
using the following rationale: 


Modern networks are simply faster than the state-of-the-art was at 
the time the initial RTO of 3 seconds was defined. 


Studies have found that the round-trip times of more than 97.5% of 
the connections observed in a large scale analysis were less than 1 
second [Chu09], suggesting that 1 second meets criterion 1 above. 


In addition, the studies observed retransmission rates within the 
three-way handshake of roughly 2%. This shows that reducing the 
initial RTO has benefit to a non-negligible set of connections. 


However, roughly 2.5% of the connections studied in [Chu09] have an 
RTT longer than 1 second. For those connections, a 1 second 
initial RTO guarantees a retransmission during connection 
establishment (needed or not). 


When this happens, this document calls for reverting to an initial 
RTO of 3 seconds for the data transmission phase. Therefore, the 
implications of the spurious retransmission are modest: (1) an 
extra SYN is transmitted into the network, and (2) according to RFC 
5681 [APB09] the initial congestion window will be limited to 1 
segment. While (2) clearly puts such connections at a 
disadvantage, this document at least resets the RTO such that the 
connection will not continually run into problems with a short 
timeout. (Of course, if the RTT is more than 3 seconds, the 
connection will still encounter difficulties. But that is not a 
new issue for TCP.) 


In addition, we note that when using timestamps, TCP will be able
to take an RTT sample even in the presence of a spurious
retransmission, facilitating convergence to a correct RTT estimate
when the RTT exceeds 1 second.


As an additional check on the results presented in [Chu09], we
analyzed packet traces of client behavior collected at four different 
vantage points at different times, as follows: 


Name       Dates           Pkts.  Cnns.  Clnts.  Servs.
LBL-1      Oct/05--Mar/06  292M   242K   228     74K
LBL-2      Nov/09--Feb/10  1.1B   1.2M   1047    38K
ICSI-1     Sep/11--18/07   137M   2.1M   193     486K
ICSI-2     Sep/11--18/08   163M   1.9M   177     277K
ICSI-3     Sep/14--21/09   334M   3.1M   170     253K
ICSI-4     Sep/11--18/10   298M   5M     183     189K
Dartmouth  Jan/4--21/04    1B     4M     3782    132K
SIGCOMM    Aug/17--21/08   11.6M  133K   152     29K


The "LBL" data was taken at the Lawrence Berkeley National 
Laboratory, the "ICSI" data from the International Computer Science 
Institute, the "SIGCOMM" data from the wireless network that served 
the attendees of SIGCOMM 2008, and the "Dartmouth" data was collected 
from Dartmouth College’s wireless network. The latter two datasets 
are available from the CRAWDAD data repository [HKA04] [SLS09]. The 
table lists the dates of the data collections, the number of packets 
collected, the number of TCP connections observed, the number of 
local clients monitored, and the number of remote servers contacted. 
We consider only connections initiated near the tracing vantage 
point. 


Analysis of these datasets finds the prevalence of retransmitted SYNs 
to be between 0.03% (ICSI-4) and roughly 2% (LBL-1 and Dartmouth).


We then analyzed the data to determine the number of additional and 
spurious retransmissions that would have been incurred if the initial 
RTO was assumed to be 1 second. In most of the datasets, the 
proportion of connections with spurious retransmits was less than 
0.1%. However, in the Dartmouth dataset, approximately 1.1% of the 
connections would have sent a spurious retransmit with a lower 
initial RTO. We attribute this to the fact that the monitored 
network is wireless and therefore susceptible to additional delays 
from RF effects. 


Finally, there are obviously performance benefits from retransmitting 
lost SYNs with a reduced initial RTO. Across our datasets, the 
percentage of connections that retransmitted a SYN and would realize 
at least a 10% performance improvement by using the smaller initial 
RTO specified in this document ranges from 43% (LBL-1) to 87% 
(ICSI-4). The percentage of connections that would realize at least 
a 50% performance improvement ranges from 17% (ICSI-1 and SIGCOMM) to 
73% (ICSI-4).


From the data to which we have access, we conclude that the lower
initial RTO is likely to be beneficial to many connections, and
harmful to relatively few.